In ophthalmology, deep learning is a computer-based tool with considerable
potential capability and efficacy. Worldwide, diabetic retinopathy (DR) is a
leading cause of loss of sight in adults aged 20-74 years. The primary means of
early detection of DR is regular screening, at intervals of ten to twenty
months for patients with no DR or mild DR. Regular screening plays a major role
in preventing vision loss, but the expected increase in cases from 415 million
in 2015 to 642 million in 2040 makes screening and follow-up a challenging task
for ophthalmologists. In this research, a transfer learning model is proposed
that combines data augmentation with Gaussian-blur and circle-crop
pre-processing to identify every stage of DR using ResNet50 with top layers.
The models were trained on the Kaggle Asia Pacific Tele-Ophthalmology Society
(APTOS) blindness dataset on a top-line graphics processing unit. The results
compare the classification metrics obtained with synthetic and non-synthetic
images: the model achieves an accuracy of 91% using the synthetic data and 86%
without using synthetic data.
1. INTRODUCTION

Eye examinations are extremely important for early detection, which improves the chances of
effective treatment; fundus cameras are used to capture the retinal images. Diabetic eye disease
(DED) comprises a group of ocular conditions including glaucoma, diabetic macular edema, and
diabetic retinopathy (DR). All forms of DED can lead to major loss of sight, which ultimately
results in vision impairment in patients aged 20-74. By 2045, the number of affected people is
expected to rise to 690 million. The key consequences of diabetes can be seen in various parts of
the body, including the retina. The onset of severe DED takes place with an abnormal growth of
blood vessels, deterioration of the optic nerve, and the formation of hard exudates near the macula
(the central part of the retina). Four types of DED are considered a threat to vision, and they are
explained briefly in the given subsection. DR, a diabetic complication affecting the eyes, is
identified by observing the damage to the blood vessels of the retina at the fundus of the eye. The
retina senses light and sends the information as signals to the brain, which decodes those signals
so that one can see objects.

Stages of DR based on severity features. DR has been classified into various stages based on its
complications in [1]-[3]. The levels shown in Figure 1 are as follows:
Level 0: No DR. There is no apparent retinopathy.

Level 1: Mild NPDR. The patient has at least one microaneurysm, with or without the presence of
other lesions.

Level 2: Moderate NPDR. The patient has haemorrhages and many microaneurysms.

Level 3: Severe NPDR. i) The patient has haemorrhages and multiple microaneurysms in all four
quadrants of the retina; ii) cotton wool spots in 2 or more quadrants; and iii) intraretinal
microvascular abnormalities in at least 1 quadrant.

Level 4: Proliferative DR. The patient is at an advanced stage beyond NPDR. At this stage,
neovascularisation brings a higher risk of leakage, which can cause severe loss of sight and
might lead to blindness.


Figure 1. DR severity levels (0,1,2,3,4) left to right 


Classifying and detecting DR manually takes time, and time is critical when the patient's case is
severe. Instead of conventional machine learning techniques such as [4], an automated system is
needed to identify the stages of DR efficiently. To build such a system, several papers were
studied to gain a better understanding of convolutional neural networks (CNNs). A survey of a
large number of papers on the diagnosis of DR shows the various methods used to detect retinopathy.

Pratt et al. [5] and Abramoff et al. [6] built CNN-based networks with data augmentation. Such a
network can detect the complex features involved in the classification task, such as
micro-aneurysms, haemorrhages in the retina, and exudates, and then produces a diagnosis on its own
without user input. They obtained 95% sensitivity and 75% precision on five thousand validation
images. Further work on CNNs is reported in [7], [8]. Other studies have built transfer learning
CNN architectures: the method in [9] trained Inception Net V3 for five-class categorization with
pretraining on the ImageNet dataset and obtained 90.0% precision, and Zhang et al. [10] trained
ResNet50, Xception Nets, DenseNets, and visual geometry group (VGG) networks with ImageNet
pretraining and obtained a precision of around 81.3%. Both research teams used the datasets
offered by APTOS and Kaggle and suggested methods based on a CNN architecture with data
augmentation.

Lam et al. [11] trained a deep learning (DL) method based on the Inception architecture on a large
dataset of more than 1.6 million retinal pictures and then tuned it on a set of 2,000 images whose
labels were approved by three ophthalmologists as a reference standard. Altogether, this model
exhibited a 5-class accuracy of 88.4%, with a precision of 96.9% for images with no DR and 57.9%
for images with mild or severe non-proliferative DR. Xu et al. [12] suggested a CNN-based method
using the VGG network architecture, trained with a back propagation neural network (NN), a deep
neural network (DNN), and a CNN. Dutta et al. [13] proposed a multi-cell multi-task CNN (MCNN).
Gulshan et al. [14] used pretrained transfer learning on AlexNet and GoogleNet models from
ImageNet. Another model for classifying DR into 5 classes is presented in [15], [16]. The EyePACS
dataset was used, with a training set of 35,126 images and a test set of 53,576 images. The
proposed DR classifier achieved a sensitivity and specificity of 90% in detecting the severity
levels of DR.

DR is traditionally diagnosed by taking retinal images and studying them for signs of the disease.
Moreover, fundus imaging devices and their installation in healthcare centers are expensive.
Because of the global shortage of ophthalmologists, healthcare professionals, and establishments,
research has also been conducted on mobile-based diagnosis services for DR. With the advancement
of technology, numerous researchers have developed image restoration and image enhancement methods
and deep learning architectures for images, particularly CNNs with classification layers at the
end.

Transfer learning techniques [17] are widely accepted and in demand because of the shortage of
labelled training data for designing and training deep CNN models [18]. In healthcare, the major
obstacle to applying deep learning models is the inadequacy of annotated training data. Transfer
learning is the use of pre-trained neural network models to categorise a previously unseen
dataset. This is particularly valuable for medical image classification, which is held back by the
inadequacy of annotated training data. The accuracies of some existing transfer learning models
are shown in Table 1.


Table 1. Existing transfer learning model accuracies 


Model name                       Dataset                 Accuracy
ResNet50                         E-Ophtha                90%
VGG16                            E-Ophtha                90%
InceptionV3                      E-Ophtha                89%
ResNet50                         APTOS Blindness 2019    90%
ResNet50                         DIARETDB1               89.2%
VGG16                            DIARETDB1               88.7%
InceptionV3                      DR detection 2015       89.8%
ResNet50/DenseNet/InceptionV3    DR detection 2015       <90%


Various international studies show that the algorithms developed so far used limited clinical
datasets that were not annotated by expert ophthalmologists. Moreover, standardized balanced
datasets are not available in the non-clinical environment for specific diseases, so the available
algorithms cannot identify the prevalence of eye disease exactly. In this paper, we investigate
the problem of the lack of medical images, such as eye disease images, for image classification;
we generate images using traditional data augmentation techniques and compare the performance
metrics of the classification model with and without synthetic images. The image classification
model is based on a CNN.


2. RESEARCH METHOD 
2.1. Datasets 

The datasets most widely used for detecting DR are Kaggle [19] and Messidor [20]. Some authors
have used the Kaggle data, whereas others have used the Messidor data. The Kaggle dataset
comprises 88,702 images, of which 35,126 are used for training and 53,576 for testing.

Messidor is the most used dataset and contains 1,200 fundus images. The Kaggle and Messidor
datasets are labelled with the stages of DR. For the proposed model, the APTOS 2019 Blindness
detection dataset [21] was used. The full dataset consists of 18,590 fundus photographs, which the
organizers of the Kaggle competition divided into 3,662 training, 1,928 validation, and 13,000
testing images. All datasets have similar class distributions; the train and validation data
distributions for APTOS 2019 are shown in Figures 2 and 3.


Figure 2. Output class distribution of the train data (80% of training data taken as train data)
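
The 80/20 division of the training data described in the captions of Figures 2 and 3 can be
illustrated with a short sketch. It assumes that the split is stratified by DR level so that both
parts keep similar class distributions; the paper does not state how the split was drawn, and the
function name `stratified_split`, the seed, and the toy label counts are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx), splitting every class in the same proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels for the five DR levels (0-4); the counts are illustrative only.
labels = [0] * 50 + [1] * 10 + [2] * 25 + [3] * 5 + [4] * 10
train_idx, val_idx = stratified_split(labels)
```

Because each class is split separately, a rare level such as level 3 still appears in the
validation part instead of being lost by chance.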


2.2. Data augmentation 
When the classes of an image dataset are imbalanced (as is usual in realistic settings), image
data augmentation is applied. Mirroring, rotation, resizing, and cropping are used to create
additional images for a class that has fewer images than another, for example the DED retina
images as opposed to the much larger group of healthy retina images. Augmentation, as in
[22]-[27], is a common approach for improving results and preventing overfitting. On inspection,
the Kaggle dataset distribution is not even. We used the Kaggle APTOS-Blindness dataset, which
comprises approximately 13,000 colour fundus images, each of 3216x2136 pixels, as displayed in
Figure 4(a). When a deep network is trained with an imbalanced dataset, the classification becomes
biased. In the first step of data augmentation, each image is resized to 224x224; the resizing
helps to maintain the original aspect ratio, as presented in Figure 4(b). The augmented images are
presented in Figure 4(c). These methods increase the dataset size, balance the samples in each
class, and prevent overfitting. During training, the validation set is used to check for and
reduce errors such as overfitting. Figure 4 presents sample images for several data augmentation
techniques.


Figure 3. Output class distribution of the validation data by diagnosis (20% of training data taken
as validation data)


Figure 4. Sample images after applying data augmentation techniques for (a) original image; (b) resized to
224x224, 256x256, 299x299 and 512x512; and (c) augmented images using rotations
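
The mirroring and rotation steps can be sketched as follows. This is a minimal illustration that
assumes images are held as NumPy arrays of shape height x width x channels; the helper name
`augment` and the tiny test array are hypothetical and stand in for real fundus images.

```python
import numpy as np


def augment(image):
    """Return mirrored and rotated copies of an H x W x C image array."""
    return {
        "mirror": np.fliplr(image),      # horizontal flip
        "rot90": np.rot90(image, k=1),   # rotation in the height/width plane
        "rot180": np.rot90(image, k=2),
        "rot270": np.rot90(image, k=3),
    }


# A small stand-in image (2 rows, 3 columns, 3 channels).
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)
variants = augment(image)
```

Generating such variants only for the under-represented DR levels is one way to move the class
counts closer together.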


2.3. Pre-processing 

Image pre-processing steps can be applied to improve the images. To adjust the images and make
them clearer, so that the model can learn features effectively, we use image processing techniques
from the OpenCV library in Python (cv2). Gaussian blur can also be used to bring out different
features in the images. In the Gaussian blur operation, the image is convolved with a Gaussian
filter [28]. The Gaussian filter is a low-pass filter that removes high-frequency components.

Gaussian filtering is thus a noise removal filter that smooths the image and reduces its detail.
The use of these techniques results in an image with lower effective resolution. Next, a circle
crop operation is applied to the resulting images; this function identifies the circular part of
the image. Figure 5 shows sample images after performing the pre-processing operations on the
APTOS-Blindness dataset.


Figure 5. Sample images after the pre-processing step
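
The two pre-processing operations can be sketched without OpenCV to show what they compute. The
code below is an illustration in plain NumPy, not the cv2 calls used in the study: the kernel size,
the value of sigma, the centred circle, and the function names are all assumptions, and the kernel
size must be odd.

```python
import numpy as np


def gaussian_kernel(size=5, sigma=1.0):
    """1-D Gaussian kernel normalised to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(gray, size=5, sigma=1.0):
    """Blur a 2-D grayscale image by convolving rows, then columns."""
    k = gaussian_kernel(size, sigma)
    padded = np.pad(gray.astype(float), size // 2, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)


def circle_crop(gray):
    """Zero out every pixel outside the largest centred circle."""
    h, w = gray.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = min(h, w) / 2.0
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return np.where(mask, gray, 0.0)


image = np.full((9, 9), 100.0)     # flat stand-in image
blurred = gaussian_blur(image)     # a flat image stays flat after blurring
cropped = circle_crop(blurred)     # corners fall outside the circle
```

In practice the same blur is available as `cv2.GaussianBlur`; the sketch only makes the
convolution and the masking explicit.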


2.4. Multiprocessing and resizing images into directory 

After pre-processing, the new images are resized and saved to a directory using multiprocessing.
The original images are about 3 MB each, and the whole image folder occupies 20 GB. This size can
be decreased by resizing. With multi-core threading, this task can be completed in about one
minute. A thread pool with six cores was used (the experiments were run on an eight-core CPU),
with IMG_SIZE set to 512x512.
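
A thread pool of this kind might look like the sketch below. It assumes the standard-library
`ThreadPoolExecutor` with six workers and a simple nearest-neighbour resize on in-memory arrays;
the function names are hypothetical, and reading from and writing to the image directory is left
out so that the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

IMG_SIZE = 512


def resize_nearest(image, size=IMG_SIZE):
    """Nearest-neighbour resize of an H x W x C array to size x size."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size   # source row for every output row
    cols = np.arange(size) * w // size   # source column for every output column
    return image[rows][:, cols]


def resize_all(images, workers=6):
    """Resize every image using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(resize_nearest, images))


# Small stand-in arrays with the same aspect ratio as the 3216x2136 originals.
images = [np.zeros((213, 321, 3), dtype=np.uint8) for _ in range(4)]
resized = resize_all(images)
```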


2.5. ResNet50 model with top layers

Pre-trained networks show a great capability for generalizing to images outside the ImageNet
dataset by means of transfer learning. Through fine-tuning, a pre-existing model can be modified.
Assuming that the pre-trained network was trained properly, we do not wish to change its weights
too much. For this modification, a learning rate is used that is generally smaller than the one
used to train the model initially. A pre-trained model is used for partial training of the
suggested model: the weights of the starting layers are kept frozen, and only the higher layers
are retrained, as shown in Figure 6. The number of layers to freeze and to train can therefore be
determined by experiment. The APTOS Blindness dataset is small, and its similarity to the
pre-training data is also considerably low. In this case, the starting layers are frozen while the
remaining layers are retrained and thereby customized to the new dataset. Customizing the higher
layers is important because the new dataset has a reduced similarity. The small size of the
dataset is compensated for by the starting layers, which were pre-trained on a large dataset and
whose weights are frozen. The research algorithm is shown in Algorithm 1.
Algorithm 1 Research algorithm
Input: fundus images (A, B); where B = {b | b ∈ {no case of DR, mild DR, moderate DR, severe DR, PDR}}
Output: a trained model which categorizes the fundus images a ∈ A
(a) Pre-processing and augmentation:
    (i) Resize the input image to 512x512.
    (ii) Augmentation step: randomly crop patches of the
         512x512 size from every retinal image and rotate by
         90/180/270 degrees.
    (iii) Apply the pre-processing techniques Gaussian blur and
          circle crop. Store the resultant images in the drive folder.
(b) Apply ResNet50 with top layers on the resultant images.
    H = {ResNet50}
    Substitute the last fully connected layer of every
    model by a layer of (5x1) attribute
    for each x ∈ X do
        for epochs = 1 to 50 do
            α = 0.0002
            if the validation error is not improving,
            reduce the learning rate, i.e., α = 0.0002
            print the accuracy if it stays constant for a long
            time
        end
    end
(c) Analyze the classification accuracy using the ROC curve and
    print the metrics.
(d) Testing
    for each a ∈ A_test do
        predict the output, print the confusion matrix
    end
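
The learning-rate rule in step (b) can be sketched as a reduce-on-plateau schedule. The starting
value 0.0002 comes from Algorithm 1; the reduction factor, the patience of two epochs, and the
validation accuracies are assumptions chosen only to make the example concrete.

```python
def learning_rate_schedule(val_accuracies, lr=0.0002, factor=0.1, patience=2):
    """Return the learning rate used in each epoch.

    The rate is multiplied by `factor` whenever the validation accuracy
    has not improved for `patience` consecutive epochs.
    """
    best = float("-inf")
    stale = 0
    history = []
    for acc in val_accuracies:
        history.append(lr)
        if acc > best:
            best, stale = acc, 0
        else:
            stale += 1
            if stale >= patience:
                lr *= factor
                stale = 0
    return history


# Hypothetical validation accuracies for six epochs.
accuracies = [0.70, 0.75, 0.75, 0.74, 0.80, 0.79]
rates = learning_rate_schedule(accuracies)
```

Here the rate stays at 0.0002 for the first four epochs and drops by the assumed factor once two
epochs pass without improvement.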


Figure 6. ResNet50 with top layers (generic knowledge layers, task-specific knowledge layers, and
fully connected layers)
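
The freezing scheme of Figure 6 can be illustrated with a framework-free sketch. The `Layer` class
below is a hypothetical stand-in that only mimics a `trainable` flag; the layer names and the
choice of three frozen layers are assumptions, not the configuration used in the study.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    trainable: bool = True


def freeze_starting_layers(layers, n_frozen):
    """Freeze the first n_frozen layers and leave the rest trainable."""
    for i, layer in enumerate(layers):
        layer.trainable = i >= n_frozen
    return layers


# Generic-knowledge backbone blocks followed by task-specific top layers.
backbone = [Layer(f"conv_block_{i}") for i in range(1, 6)]
top = [Layer("global_pool"), Layer("dense_5")]
model_layers = freeze_starting_layers(backbone + top, n_frozen=3)
trainable_names = [layer.name for layer in model_layers if layer.trainable]
```

Varying `n_frozen` is the experiment the text describes: fewer frozen layers adapt more of the
network to the new dataset but leave more weights to be learned from a small training set.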


3. RESULTS AND DISCUSSION 

In this section, the results of the model on the APTOS-Blindness dataset are examined. The model
was trained on a top-line graphics processing unit (GPU) (NVIDIA GeForce GTX 1070 Laptop) with the
CUDA DNN library. TensorFlow and Keras were used (Keras as a package for deep learning and
TensorFlow as its backend). This section aims to investigate the impact of data augmentation on
the performance of classification models. Specifically, model performance is compared with and
without synthetic images (from the online data augmentation techniques) in the training data. A
CNN model was implemented for the classification experiments [29], [30]; it is composed of four
convolution layers, each followed by a max-pooling operation, and two fully connected layers. A
rectified linear unit (ReLU) was used as the activation function in all layers. The dataset was
divided into training and test sets based on three-fold cross-validation. The experiments included
two scenarios. In the first, the model was trained without synthetic images. In the second, the
model was re-trained after the images generated by the online augmentation techniques were
included in the training set. In both scenarios, however, the test set was always drawn from the
original dataset. The model was trained on the Kaggle APTOS-Blindness dataset for 50 epochs. The
learning rate was set to 10° for the initial epochs and, if accuracy did not improve, was
automatically reduced to 107; the model achieved its best accuracy of 91.1% at a learning rate of
10°!. For the starting layers of the ResNet50 model, layer.trainable is False, and for the
subsequent layers layer.trainable is True. The Adam optimizer was used for this model.

The accuracies of the two classification models were analyzed using the receiver operating
characteristic (ROC) curve. The ROC curve maps the relationship between the false positive rate on
the x-axis and the true positive rate on the y-axis across the full range of possible thresholds.
The classification accuracy is 86% on the original dataset without synthetic data and 91.1% with
the synthetic dataset. Figure 7(a) shows the classification metrics of the model, Figure 7(b) the
confusion matrix of the model, and Figure 7(c) the ROC curve in the baseline case (i.e., without
synthetic data).
Figure 7. Classification report of the model without data augmentation for (a) classification metrics of the
model; (b) confusion matrix of the model; and (c) ROC curve in the baseline case
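
How an ROC curve is traced across thresholds can be shown with a small binary example. For the
five DR levels, one common choice is to compute such a curve per class in a one-vs-rest fashion;
that choice, the function names, and the scores below are assumptions for illustration and are not
taken from the experiments.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels (1 = class present) and predicted scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
area = auc(points)
```

Each threshold contributes one point, so lowering the threshold moves the curve from (0, 0)
towards (1, 1); a larger area under the curve indicates better separation of the classes.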


The ROC curves demonstrate that the performance of the model improved consistently after the
synthetic images were added to the original dataset. The results show that the overall accuracy of
the classification model improved from 86% to 91.1%, a gain of approximately 5 percentage points.
The aim was to correctly categorise and identify all DR stages, most importantly the initial
stages. Figure 8(a) shows the classification metrics of the model, Figure 8(b) the confusion matrix
of the model, and Figure 8(c) the ROC curve when the traditional data augmentation methods are
applied.
Figure 8. Classification report of the model with traditional data augmentation for (a) classification metrics
of the model; (b) confusion matrix of the model; and (c) ROC curve with synthetic data
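
The confusion matrix and the metrics reported in Figures 7 and 8 are related as in the sketch
below. The labels and predictions are made up for illustration and do not reproduce the reported
results; the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes=5):
    """Rows are true DR levels, columns are predicted DR levels."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall_per_class(matrix):
    """Share of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Hypothetical true and predicted DR levels for ten images.
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 0, 1, 2, 2, 2, 3, 4, 4, 4]
matrix = confusion_matrix(y_true, y_pred)
overall = accuracy(matrix)
recalls = recall_per_class(matrix)
```

Per-class recall makes visible what a single accuracy figure hides, for example whether the
initial DR stages are recognised as reliably as the others.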


4. CONCLUSION 

Deep learning tools open a wide range of possibilities for developing effective models that
deliver better results, and they have potential utility in ophthalmology. Diabetes is a rapidly
growing disease that affects the body severely. Diagnosing the disease by manual means is tiresome
and prone to error. Therefore, computational tools for automatic diagnosis have been developed, as
discussed in the literature. In the present study, we presented an automatic deep learning model
that identifies the different stages of DR using the APTOS-Blindness dataset. The proposed model is
ResNet50; synthetic data strengthens the classification model and improves its capability. The
results compare the classification metrics obtained with synthetic and non-synthetic images. The
model identifies all the stages of DR, unlike the existing methods, and achieves an accuracy of
91.1% using the synthetic data and 86% without using synthetic data. In future work, we also
intend to train stage-specific models so that the accuracy for the initial stages can be improved.